Loading a Custom Dataset
1. The Dataset
The dataset can be downloaded by clicking here, and the plugin documentation can be found here.
The dataset describes a pH measuring experiment: samples are taken from a solution every hour and the pH value is measured. The results are placed in a Microsoft Excel file as a data matrix with one sample on each row.
We would like to be able to upload this file format to Scifeon and create entities for the samples and the experiment.
2. Function Defining Decorators
There are many ways to customize your Scifeon instance. To tell Scifeon what exactly we are modifying, we set a decorator at the beginning of the script.
A decorator is a wrapper for your code and it tells Scifeon where and how to use your script.
There are currently two essential decorators: the @route decorator, which defines a new page on the website (more on that later), and the @scifeonPlugin decorator, which adds functionality to already existing pages in Scifeon (a "plugin").
A simple plugin is defined below:
import { PLUGIN_TYPE, scifeonPlugin } from '@scifeon/plugins';

@scifeonPlugin({
    name: "DemoSampleDataLoader",
    type: PLUGIN_TYPE.DATA_LOADER,
})
The @scifeonPlugin decorator is imported and added to the beginning of the file. The decorator takes a number of parameters, of which name and type are mandatory. The type decides what kind of plugin we are making. The list of possible plugin types is available in the PLUGIN_TYPE interface, which is also imported into the file.
As we are making a data loader, we use the PLUGIN_TYPE.DATA_LOADER type.
Another important parameter for the decorator is the match parameter. This is a method which decides whether a given plugin is available or not. As we are making a data loader plugin, we (ultimately) only want it to show up when we are uploading a dataset in our specific format. Leaving it out makes it evaluate to true no matter the dataset type, which is fine while we are simply trying out how to make the data loader.
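How the match method receives its input is not covered in this tutorial, so the following is only a sketch of the kind of check it could perform, kept independent of the Scifeon API. The helper name looksLikeDemoDataset and its parameters are hypothetical; the worksheet name is the one used by the demo file.

```typescript
// Hypothetical helper, not part of the Scifeon API: decides from a file name
// and its worksheet names whether an upload looks like the demo dataset.
function looksLikeDemoDataset(fileName: string, sheetNames: string[]): boolean {
    // Accept both .xls and .xlsx, ignoring case.
    const isExcel = /\.xlsx?$/i.test(fileName);
    // The demo file contains a single worksheet with this name.
    const hasDemoSheet = sheetNames.indexOf("ScifeonDemoSamples") !== -1;
    return isExcel && hasDemoSheet;
}
```

A check of this kind keeps the data loader from being offered for unrelated uploads once you move past experimenting.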
Let's start by making a class with an init() method:
export class DemoSampleDataLoader {
    init() {
    }
}
The init() method is part of the lifecycle of Aurelia (the software used for the Scifeon framework) and is called as soon as the element is loaded.
3. Accessing File Data
To access the file data, we use Scifeon's data upload page. Open your Scifeon instance and click the button with a levitating arrow in the side panel:
You can then upload all kinds of files to Scifeon by dragging and dropping them onto the marked area, or you can browse your system by clicking the Select.. button.
Once the files are selected for upload, their data will be available to your data loader. To read them, import the FileContext type:
import { FileContext } from '@scifeon/plugins';
The file data can now be accessed and, for instance, printed to the console by changing the init() method in your data loader a little:
init(context: FileContext) {
    console.log(context)
}
The context object contains information about the file(s) selected for upload. This includes both metadata, such as creation date and file size, and the data in the file.
In the case of an Excel file, information on the data cells can be found in the wb (workbook) property of the fileInfo object.
4. Workbook Data
The workbook property contains information on the Excel file marked for upload. To reach the data matrix itself, we open the Sheets property. This property contains a list of the worksheets in the file. If you are using the given demo file, there should be a single worksheet called ScifeonDemoSamples.
The worksheet object contains all of the cells in the worksheet in several representations. To access the data of, for instance, cell D3, type ...Sheets.ScifeonDemoSamples["D3"].v where the triple dots represent the rest of the property chain. The .v property contains the value of the cell in its Excel data type (integer, text, etc.), which is directly translated to the TypeScript equivalent.
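The property chain can be tried out without Scifeon on a hand-built object of the same shape. The sketch below is an illustration under assumptions: the Cell, Sheet and Workbook types and the readCell helper are hypothetical names, and the cell values are invented rather than taken from the demo file.

```typescript
// Hypothetical minimal types mirroring only the parts of the workbook used here.
type CellValue = string | number | boolean;
interface Cell { v: CellValue }
// A sheet maps cell addresses ("A3", "D3", ...) to cells; "!ref" is a plain string.
type Sheet = { [key: string]: Cell | string };
interface Workbook { Sheets: { [name: string]: Sheet } }

// Returns the value of a cell, or undefined if the sheet or cell is missing.
function readCell(wb: Workbook, sheetName: string, address: string): CellValue | undefined {
    const sheet = wb.Sheets[sheetName];
    if (!sheet) return undefined;
    const cell = sheet[address];
    // Skip missing cells and string-valued metadata entries such as "!ref".
    if (cell === undefined || typeof cell === "string") return undefined;
    return cell.v;
}

// Invented example data, not the contents of the demo file.
const demoWorkbook: Workbook = {
    Sheets: {
        ScifeonDemoSamples: {
            "!ref": "A1:D3",
            A3: { v: "Sample 1" },
            D3: { v: 7.2 },
        },
    },
};

console.log(readCell(demoWorkbook, "ScifeonDemoSamples", "D3")); // 7.2
```

Returning undefined for missing cells avoids a runtime error when a row in the file is only partially filled in.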
5. Iterating through the Cells
To access each cell of the sheet, we then iterate through all of the properties of the ScifeonDemoSamples object.
We use the !ref property, which contains the border values for the Excel sheet, to create an upper bound on a for-loop running through each data row.
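// match() returns an array of matches, so take its first element and convert it to a number.
const limit = Number(sheet["!ref"].split(":")[1].match(/\d+/)[0]);
for (let i = 3; i <= limit; i++) {
    ...
}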
The !ref property contains both letters and numbers (a range of the form A1:D22). Since we already know the number of columns for our data, we only extract the number of the last row with the regular expression /\d+/.
We can then iterate through row 3 to limit, accessing each data point.
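As a standalone sketch of the same idea, the row limit can be extracted in a small function that also handles a range without a colon and a missing row number. The function names lastRow and dataRows are hypothetical, and the first data row defaults to 3 as in the demo file.

```typescript
// Extracts the last row number from a range string such as "A1:D22".
function lastRow(ref: string): number {
    // "A1:D22" -> "D22"; a single-cell range such as "A1" has no colon.
    const parts = ref.split(":");
    const end = parts[parts.length - 1];
    const digits = end.match(/\d+/);
    if (digits === null) {
        throw new Error('No row number found in range "' + ref + '"');
    }
    return Number(digits[0]);
}

// Lists the row numbers that hold data, from firstDataRow to the last row.
function dataRows(ref: string, firstDataRow: number = 3): number[] {
    const limit = lastRow(ref);
    const rows: number[] = [];
    for (let i = firstDataRow; i <= limit; i++) {
        rows.push(i);
    }
    return rows;
}

console.log(dataRows("A1:D5")); // [3, 4, 5]
```

Converting the match to a number once, outside the loop, keeps the loop condition a plain numeric comparison.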
6. The Scifeon Datamodel
Before we start writing to the database, let's take a step back and think about what kind of data we want saved and how it translates to the Scifeon datamodel.
Scifeon consists of predefined entity classes, all taking different parameters. These can be inspected more closely on the datamodel page in Scifeon (requires a running Scifeon instance).
For the experimental dataset we would like to create entities in Scifeon, both for the samples defined by the set, but also an experiment entity with a laboratory step for the samples to belong to.
Moreover, samples are a representation of a laboratory entity, moving from step to step, and do not contain result values. Instead, we generate a result set entity with result values to save these.
Thus, we end up with the following list of entities we wish to create for each experiment:
- 1 Experiment
- 1 Step
- 1 ResultSet
- 1 Sample for each row in the Excel file
- 1 ResultValue for each row in the Excel file
7. Generating Scifeon Entities
We make a list of entities that will be saved to the database later. We can then push the single elements as we create them. Here's an example of how to make the experiment and step entities:
this.entities.push({
    eClass: "Experiment",
    id: "#expID",
    type: "DemoExperiment"
})
this.entities.push({
    eClass: "Step",
    id: "#stepID",
    experimentID: "#expID"
})
The class of the entity is decided by the eClass property. Different classes have different mandatory properties, but an ID is required for all entities. To make ID generation easier, Scifeon has automated this with the "#" notation. This notation allows us to simply pass a string such as "#expID" to the database, and Scifeon will automatically update it to an eligible ID.
This allows us to generate relations between entities. A step entity, for instance, requires an experimentID. By using the same ID notation for this property as was used for the experiment ID, they will match in the database as well.
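Because relations depend on the "#" strings matching exactly, a typo in one of them is easy to miss. The following is a hypothetical helper, not part of Scifeon, that lists every "#" reference in an entity list that does not match the id of any entity in the same list. The Entity type is an assumption made for this sketch.

```typescript
// Hypothetical minimal entity shape for this sketch.
interface Entity { eClass: string; id: string; [prop: string]: unknown }

// Returns a description of every "#..." reference that no entity defines.
function unresolvedReferences(entities: Entity[]): string[] {
    const definedIds = entities.map(e => e.id);
    const missing: string[] = [];
    for (const entity of entities) {
        for (const key of Object.keys(entity)) {
            if (key === "id") continue; // the id defines, it does not refer
            const value = entity[key];
            if (typeof value === "string" && value.charAt(0) === "#"
                && definedIds.indexOf(value) === -1) {
                missing.push(entity.eClass + "." + key + " -> " + value);
            }
        }
    }
    return missing;
}

const checked = unresolvedReferences([
    { eClass: "Experiment", id: "#expID", type: "DemoExperiment" },
    { eClass: "Step", id: "#stepID", experimentID: "#expId" }, // typo on purpose
]);
console.log(checked); // ["Step.experimentID -> #expId"]
```

Running a check like this before handing the list over gives a readable message instead of a failed or silently unlinked upload.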
To generate the samples, we loop through the rows of the Excel sheet:
for (let i = 3; i <= limit; i++) {
    this.entities.push({
        eClass: "Sample",
        id: "#sampleID" + i,
        name: sheet["A"+i].v,
        ...
    })
    this.entities.push({
        eClass: "ResultValue",
        id: "#valueID" + i, // one distinct ID per row, as for the samples
        valueText: sheet["D"+i].v,
        ...
    })
}
The "ResultValue" class has several ways to save the data depending on the data type and the preference of the developer. Here we simply use the valueText field. If you want to connect the result values to the samples, you can use the subjectID and subjectClass properties.
Remember to also create a "ResultSet" entity connecting the step entity with the result values.
Once we have generated all of the required entities, our entity list contains 43 entities.
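The count follows from the entity list in section 6: three fixed entities plus two per data row, so 43 entities correspond to 20 rows. The sketch below builds such a list from plain row objects. It is an illustration under assumptions: the Row type and buildEntities are hypothetical, and the stepID and resultSetID field names are guesses that should be checked against the datamodel page.

```typescript
interface Entity { eClass: string; id: string; [prop: string]: unknown }
interface Row { name: string; ph: number }

// Builds 3 fixed entities plus one Sample and one ResultValue per row.
function buildEntities(rows: Row[]): Entity[] {
    const entities: Entity[] = [
        { eClass: "Experiment", id: "#expID", type: "DemoExperiment" },
        { eClass: "Step", id: "#stepID", experimentID: "#expID" },
        // ASSUMPTION: "stepID" is a guessed name for the link to the step.
        { eClass: "ResultSet", id: "#resultSetID", stepID: "#stepID" },
    ];
    rows.forEach((row, index) => {
        const sampleID = "#sampleID" + index;
        entities.push({ eClass: "Sample", id: sampleID, name: row.name });
        entities.push({
            eClass: "ResultValue",
            id: "#valueID" + index,
            // ASSUMPTION: "resultSetID" is a guessed name for the link to the result set.
            resultSetID: "#resultSetID",
            subjectClass: "Sample",
            subjectID: sampleID,
            valueText: String(row.ph),
        });
    });
    return entities;
}

// Invented rows: 3 fixed entities + 2 * 2 row entities = 7.
const demoEntities = buildEntities([
    { name: "Sample 1", ph: 7.1 },
    { name: "Sample 2", ph: 6.9 },
]);
console.log(demoEntities.length); // 7
```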
8. Saving Entities to the Database
Congratulations! You've made all of the entities necessary to generate a complete experiment with results in Scifeon.
To save the entities to the database, we use a special function that interacts with the Aurelia framework and is triggered when the upload button is pushed.
The function looks like this:
getResult() {
    return {entities: this.entities}
}
The list we made earlier is wrapped in an object which is sent to the upload functionality within Scifeon. If it works, you should see something like the following picture when you click the button: